Abstract. We represent agents as sets of strings. Each string encodes
a potential interaction with another agent or environment. We represent
the total set of dynamics between two agents as the intersection of their
respective sets of strings, and we prove complexity properties of player
interactions using Algorithmic Information Theory. We show how the proposed
construction is compatible with Universal Artificial Intelligence, in that the
AIXI model can be seen as universal with respect to interaction.

1 Introduction

Whereas classical Information Theory is concerned with quantifying the expected
number of bits needed for communication, Algorithmic Information Theory
(AIT) principally studies the complexity of individual strings. A central
measure of AIT is the Kolmogorov complexity $C(x)$ of a string $x$, which is the
size of the smallest program that will output $x$ on a universal Turing machine.
Another central definition of AIT is the universal prior $m(x)$ that weights a
hypothesis (string) by the complexity of the programs that produce it [LV08].
This universal prior has many remarkable properties; if $m(x)$ is used for
induction, then any computable sequence can be learned with only the minimum
amount of data. Unfortunately, $C(x)$ and $m(x)$ are not finitely computable.
Algorithmic Information Theory can be interpreted as a generalization of
classical Information Theory [CT91] and the Minimum Description Length principle.
Some other applications include universal PAC learning and Algorithmic
Statistics [LV08, GTV01].

The question of whether AIT can be used to form the foundation of Artificial
Intelligence was answered in the affirmative with Hutter's Universal Artificial
Intelligence (UAI) [Hut04]. This was achieved by the application of the universal
prior $m(x)$ to the cybernetic agent model, where an agent communicates with
an environment through sequential cycles of action, perception, and reward. It
was shown that there exists a universal agent, the AIXI model, that inherits
many universality properties from $m(x)$. In particular, the AIXI model will
converge to achieve optimal rewards given long enough time in the environment.
As almost all AI problems can be formalized in the cybernetic agent model, the
AIXI model is a complete theoretical solution to the field of Artificial General
Intelligence [GP07].

In this paper, we represent agents as sets of strings and the potential dynamics
between them as the intersection of their respective sets of strings (Sec. 2).
We connect this interpretation of interacting agents to the cybernetic agent
model (Sec. 2.2). We provide background on Algorithmic Information Theory
(Sec. 3) and show how agent learning can be described with algorithmic
complexity (Sec. 4). We apply combinatorial and algorithmic proof techniques [VV10]
to study the dynamics between agents (Sec. 5). In particular, we describe the
approximation of agents (Th. 2), the conditions for removal of superfluous
information in the encoding of an agent (Th. 3), and the consequences of having
multiple players achieving the same rewards in an environment (Th. 4). We show
how the interpretation given in Sec. 5 is compatible with Universal Artificial
Intelligence, in that the AIXI model has universality properties with respect to
our definition of "interaction" (Sec. 6).

2 Interaction as Intersection 

We define players $A$ and $B$ as two sets containing strings of size $n$. Each string
$x$ in the intersection set $A \cap B$ represents a particular "interaction" between
players $A$ and $B$. We will use the terms string and interaction interchangeably.
This set representation can be used to encode non-cooperative games (Sec. 2.1)
and instances of the cybernetic agent model (Sec. 2.2). Uncertainties in instances
of both domains can be encoded into the size of the intersections. The amount
of uncertainty between players is equal to $|A \cap B|$. If the interaction between
the players is deterministic then $|A \cap B| = 1$. If uncertainty exists, then multiple
interactions are possible and $|A \cap B| > 1$. We say that player $A$ interacts with
$B$ if $|A \cap B| > 0$.
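The definitions above translate directly into set operations. The following is a minimal sketch, assuming three hypothetical players over strings of size 3; the particular strings and the helper names `interaction_set`, `interacts`, and `uncertainty` are illustrative and do not come from the paper.

```python
# Hypothetical toy players over strings of size n = 3 (illustrative values only).
A = {"000", "011", "101"}
B = {"011", "101", "110"}
C = {"111"}

def interaction_set(p, q):
    """All interactions between players p and q, i.e. their intersection."""
    return p & q

def interacts(p, q):
    """Player p interacts with player q iff the intersection is non-empty."""
    return len(interaction_set(p, q)) > 0

def uncertainty(p, q):
    """The amount of uncertainty between p and q is the size of the intersection."""
    return len(interaction_set(p, q))

print(sorted(interaction_set(A, B)), uncertainty(A, B), interacts(A, C))
```

Here `A` and `B` share two strings, so their interaction is not deterministic, while `A` and `C` do not interact at all.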

2.1 Non-cooperative Games 

Sets can be used to encode adversaries in sequential games [RN09], where agents
exchange a series of actions over a finite number of plies. Each game or
interaction consists of the recording of actions by adversaries $\alpha$ and $\beta$, with
$x = (a_1,b_1)(a_2,b_2)(a_3,b_3)$ for a game of three rounds. The player (set)
representation $A$ of adversary $\alpha$ is the set of games representing all possible actions
by $\alpha$'s adversary with $\alpha$'s responding actions, and similarly for player $B$
representing adversary $\beta$. An example game is rock-paper-scissors where adversaries
$\alpha$ and $\beta$ play two sequential rounds with an action space of $\{R, P, S\}$. Adversary
$\alpha$ only plays rock, whereas adversary $\beta$ first plays paper, then copies his
adversary's play of the first round. The corresponding players (sets) $A$ and $B$ can be
seen in Fig. 1a. The intersection set of $A$ and $B$ contains the single interaction
$x$ = "(R,P)(R,R)", which is the only possible game (interaction) that $\alpha$ and $\beta$
can play.
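The rock-paper-scissors example can be checked by enumeration. The sketch below assumes a game is stored as a tuple of two `(a_k, b_k)` rounds; this tuple layout is an illustrative choice, not notation from the paper.

```python
from itertools import product

ACTIONS = ("R", "P", "S")

# A two-round game is ((a1, b1), (a2, b2)): alpha's action first, beta's second.
# Player A: alpha always plays rock, whatever beta does.
A = {(("R", b1), ("R", b2)) for b1, b2 in product(ACTIONS, repeat=2)}
# Player B: beta opens with paper, then copies alpha's first-round action.
B = {((a1, "P"), (a2, a1)) for a1, a2 in product(ACTIONS, repeat=2)}

# The only game both adversaries can play together.
games = A & B
print(games)
```

Each player contains nine games, one for every pair of actions its adversary could choose, and the intersection leaves the single game described in the text.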

Fig. 1. (a) The set representation of players $A$ and $B$ playing two games of
rock-paper-scissors. The intersection set of $A$ and $B$ contains the single interaction
$x$ = "(R,P)(R,R)". (b) The cybernetic agent model.

Example 1 (Chess Game). We use the example of a chess game with uncertainty
between two players: Anatoly as white and Boris as black. An interaction
$x \in A \cap B$ between Anatoly and Boris is a game of chess played for at most $m$ plies
for each player, with $x = a_1b_1a_2b_2 \ldots a_mb_m = ab_{1:m}$. The chess move space
$V \subset \{0,1\}^*$ has a short binary encoding, whose precise definition is not important. If
the game has not ended after $m$ rounds, then the game is considered a draw. Both
players are nondeterministic: at every ply, they can choose from a selection
of moves. Anatoly's decisions can be represented by a function $f_A : V^* \to 2^V$ and
similarly Boris' decisions by $f_B$. Anatoly can be represented by a set $A$, with
$A = \{ab_{1:m} : \forall_{1 \le k \le m}\; a_k \in f_A(ab_{1:k-1})\}$, and similarly Boris by set $B$. Their
intersection, $A \cap B$, represents the set of possible games that Anatoly and Boris
can play together.
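The construction of a player from a decision function can be sketched on a game far smaller than chess. The code below assumes a hypothetical two-move space and two plies per player; the decision functions `f_A` and `f_B` are invented for illustration, and the black player's function is assumed to see the history up to and including white's latest move.

```python
from itertools import product

V = ("0", "1")   # hypothetical two-move space, standing in for chess moves
M = 2            # plies per player

def f_A(history):
    """Moves white may choose after `history` (a tuple of all earlier moves)."""
    # White opens freely, then mirrors black's last move.
    return {"0", "1"} if not history else {history[-1]}

def f_B(history):
    """Moves black may choose after `history`, which ends with white's move."""
    return {"1"} if history[-1] == "0" else {"0", "1"}

def player_set(f, offset):
    """All games a1 b1 ... aM bM whose moves at positions offset, offset + 2, ...
    are allowed by f; the other player's moves are unconstrained."""
    games = set()
    for game in product(V, repeat=2 * M):
        if all(game[i] in f(game[:i]) for i in range(offset, 2 * M, 2)):
            games.add(game)
    return games

A = player_set(f_A, 0)   # white moves sit at even positions
B = player_set(f_B, 1)   # black moves sit at odd positions
games = A & B
print(len(A), len(B), len(games))
```

Because both players are nondeterministic, the intersection contains several games, which is the uncertainty $|A \cap B| > 1$ described in Section 2.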

Generally, sets can encode adversaries of non-cooperative normal form games,
with their interactions consisting of pure Nash equilibria [RN09]. A normal
form game is defined as $(p, q)$ with the adversaries represented by normalized
payoff functions $p$ and $q$ of the form $\{0,1\}^n \times \{0,1\}^n \to [0,1]$. The set of pure
Nash equilibria is $\{(x,y) : p(x,y) = q(y,x) = 1\}$. For each payoff function
$p$ there is a player $A = \{(x,y) : p(x,y) = 1\}$, and for each payoff function $q$
there is a player $B$. The intersection of $A$ and $B$ is equal to the set of pure Nash
equilibria of $p$ and $q$.
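As a small illustration of this encoding, the sketch below assumes one-bit strategies and two hypothetical payoff functions, and takes player $B$ to be $\{(x,y) : q(y,x) = 1\}$ by analogy with $A$. It follows the definition used in the text, where an equilibrium is a profile that gives both adversaries the maximal normalized payoff of 1.

```python
from itertools import product

STRATEGIES = ("0", "1")  # one-bit strategies, i.e. n = 1

def p(x, y):
    """Hypothetical normalized payoff of the first adversary (a coordination game)."""
    return 1.0 if x == y else 0.0

def q(y, x):
    """Hypothetical normalized payoff of the second adversary, who prefers matching on '1'."""
    if x == y == "1":
        return 1.0
    return 0.5 if x == y else 0.0

# Each player is the set of strategy profiles that gives it payoff 1.
A = {(x, y) for x, y in product(STRATEGIES, repeat=2) if p(x, y) == 1.0}
B = {(x, y) for x, y in product(STRATEGIES, repeat=2) if q(y, x) == 1.0}

equilibria = A & B
print(equilibria)
```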

2.2 Cybernetic Agent Model 

The interpretation of "interaction as intersection" is also applicable to the
cybernetic agent model used in Universal Artificial Intelligence [Hut04]. With the
cybernetic agent model, there is an agent and an environment communicating in
a series of cycles $k = 1, 2, \ldots$ (Fig. 1b). At cycle $k$, the agent performs an action
$y_k \in \mathcal{Y}$, dependent on the previous history $yx_{<k} = y_1x_1 \ldots y_{k-1}x_{k-1}$. The
environment accepts the action and in turn outputs $x_k \in \mathcal{X}$, which can be interpreted
as the $k$th perception of the agent, followed by cycle $k+1$ and so on. An agent
is defined by a deterministic policy function $p : \mathcal{X}^* \to \mathcal{Y}^*$, with $p(x_{<k}) = y_{1:k}$
denoting output $y_{1:k} = y_1y_2 \ldots y_k$ on input $x_{<k} = x_1x_2 \ldots x_{k-1}$. We use the terms
policy and agent interchangeably. The inputs are separated into two parts,
$x_k = r_ko_k$, with $r_k = r(x_k)$ representing the reward and $o_k$ representing the
observation. We say $r(x_{1:m}) = \sum_{i=1}^{m} r(x_i)$ and we assume bounds on rewards with
$0 \le r_k \le c$ for all $k$. There is uncertainty in the environment; it can be represented
by a probability distribution over infinite strings, where $\mu(x_1 \ldots x_n)$ is the
probability that an infinite string starts with $x_1 \ldots x_n$. In Hutter's notation [Hut04],
an underlined argument $\underline{x}_k$ is a probability variable and non-underlined
arguments $x_k$ represent the condition, with $\mu(x_{<n}\underline{x}_n) = \mu(\underline{x}_{1:n}) / \mu(\underline{x}_{<n})$. The
probability that the environment reacts with $x_1 \ldots x_n$ under agent output $y_1 \ldots y_n$
is $\mu(y_1\underline{x}_1 \ldots y_n\underline{x}_n) = \mu(y\underline{x}_{1:n})$. The environment is chronological, in that input $x_i$
only depends on $yx_{<i}y_i$. The horizon $m$ of the interaction is the number of cycles
of the interaction. The value of agent $p$ in environment $\mu$ is the expected reward
sum $V^{p\mu}_{1m} = \sum_{x_{1:m}} r(x_{1:m})\, \mu(y\underline{x}_{1:m})|_{y_{1:m} = p(x_{<m})}$. The optimal agent that
maximizes value $V^{p\mu}_{1m}$ is $p^{\mu} = \arg\max_p V^{p\mu}_{1m}$, with value $V^{*\mu}_{1m} = V^{p^{\mu}\mu}_{1m}$. The optimal
expected reward given a partial history $yx_{1:k}$ is $V^{*\mu}_{1m}(yx_{1:k})$.
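The value of a policy can be computed by brute force when the horizon and percept space are tiny. The sketch below makes several simplifying assumptions that are not in the paper: the percept is the reward itself (no observation part), the policy returns only the current action $y_k$ rather than the whole action string, and the environment `mu_step` is a hypothetical chronological environment given by its per-cycle conditional probabilities.

```python
from itertools import product

M = 2        # horizon m
X = (0, 1)   # percepts; here the percept x_k is just the reward r_k

def policy(x_prev):
    """Deterministic agent: action y_k as a function of the earlier percepts x_{<k}."""
    return 1 if not x_prev else x_prev[-1]

def mu_step(y_hist, x_prev, x):
    """Hypothetical chronological environment: probability that x_k = x,
    given the actions so far (including y_k) and the earlier percepts."""
    p_reward = 0.75 if y_hist[-1] == 1 else 0.25
    return p_reward if x == 1 else 1.0 - p_reward

def value(policy, mu_step, m):
    """Expected reward sum: sum over x_{1:m} of r(x_{1:m}) * mu(y x_{1:m}),
    with the actions fixed by the policy."""
    total = 0.0
    for xs in product(X, repeat=m):
        prob, ys = 1.0, []
        for k in range(m):
            ys.append(policy(xs[:k]))
            prob *= mu_step(tuple(ys), xs[:k], xs[k])
        total += sum(xs) * prob
    return total

print(value(policy, mu_step, M))
```

The product of the per-cycle conditionals plays the role of $\mu(y\underline{x}_{1:m})$, which is valid here because the toy environment is chronological.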

It is possible to construct players (sets) $A$ and $B$ from the agent $p$ and
environment $\mu$ where $A$ "interacts" (intersects) with $B$ only if agent $p$ can achieve
a certain level of reward in $\mu$. This construction enables us to apply the results
and proof techniques of Section 5 to the cybernetic agent model. To translate
$p$ and $\mu$ to $A$ and $B$, we fix two parameters: a time horizon $m$ and a difficulty
threshold $\tau$. For every agent $p$, there is a player $A^p_m$, with
$A^p_m = \{yx_{1:m} : y_{1:m} = p(x_{<m})\}$. There are several possible ways to construct a set $B$ from
an environment $\mu$. One direct method is, for every environment $\mu$, to define a
player $B^{\mu}_{m,\tau}$, with $B^{\mu}_{m,\tau} = \{yx_{1:m} : r(x_{1:m}) \ge \tau,\ \mu(y\underline{x}_{1:m}) > 0\}$. Player $B^{\mu}_{m,\tau}$
represents all possible histories of $\mu$ (however unlikely) where the reward is at
least $\tau$. If $A^p_m \cap B^{\mu}_{m,\tau} = \emptyset$, then environment $\mu$ is "too difficult" for the agent $p$;
there is no interaction where the agent can receive a reward of at least $\tau$. We say
the agent $p$ interacts with the environment $\mu$ at time horizon $m$ and difficulty $\tau$
if $A^p_m \cap B^{\mu}_{m,\tau} \ne \emptyset$.
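The translation from an agent and an environment to a pair of sets can be sketched for a toy setting. The code below assumes binary actions and percepts, a percept that equals the reward, and a hypothetical environment `mu` in which a reward is only possible after action 1; histories are stored as tuples of `(y_k, x_k)` pairs. None of these concrete choices come from the paper.

```python
from itertools import product

M, TAU = 2, 2    # horizon m and difficulty threshold tau (toy values)
Y = X = (0, 1)   # actions, and percepts (the percept is the reward itself)

def mu(ys, xs):
    """Hypothetical environment: probability of percepts xs under actions ys.
    Reward 1 has probability 0.5 after action 1 and probability 0 after action 0."""
    prob = 1.0
    for y, x in zip(ys, xs):
        p1 = 0.5 if y == 1 else 0.0
        prob *= p1 if x == 1 else 1.0 - p1
    return prob

def agent_set(policy, m):
    """A^p_m: all histories whose actions are the ones the policy outputs."""
    result = set()
    for xs in product(X, repeat=m):
        ys = tuple(policy(xs[:k]) for k in range(m))
        result.add(tuple(zip(ys, xs)))
    return result

def environment_set(mu, m, tau):
    """B^mu_{m,tau}: all possible histories with total reward at least tau."""
    return {tuple(zip(ys, xs))
            for ys in product(Y, repeat=m)
            for xs in product(X, repeat=m)
            if sum(xs) >= tau and mu(ys, xs) > 0}

def always_one(x_prev):
    return 1

def always_zero(x_prev):
    return 0

B = environment_set(mu, M, TAU)
print(agent_set(always_one, M) & B)    # non-empty: this agent interacts with mu
print(agent_set(always_zero, M) & B)   # empty: mu is "too difficult" for this agent
```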

Example 2 (Peter and Magnus). We present a cybernetic agent model
interpretation of chess with reward based players Peter and Magnus (same rules as
Example 1). Peter, the agent $p$, has to be deterministic whereas Magnus, the
environment $\mu$, has uncertainty. At cycle $k$, each action $y_k$ is Peter's move and
each perception $x_k$ is Magnus' move. At ply $m$ in the chess game, Magnus
returns a reward of 1 if Peter has won. In rounds where the game is unfinished or
if Peter loses or draws, the reward is 0. The player (set) $A^p_m$ represents Peter's
plays for $m$ rounds. The player (set) for Magnus with difficulty threshold $\tau = 1$
and $m$ plies, $B^{\mu}_{m,1}$, is the set of all games that Magnus loses in $m$ rounds or less.
If $A^p_m \cap B^{\mu}_{m,1} = \emptyset$, then Peter cannot interact with Magnus at difficulty level 1;
Peter can never beat Magnus at chess in $m$ rounds or less. If $A^p_m \cap B^{\mu}_{m,1} \ne \emptyset$,
then Peter can beat Magnus at a game of chess in $m$ rounds or less.

Another construction of a player $D^{\mu}_{m,\tau}$ with respect to environment $\mu$ is
$D^{\mu}_{m,\tau} = \{yx_{1:m} : \forall_k\; V^{*\mu}_{1m}(yx_{1:k}) / V^{*\mu}_{1m} \ge \tau\}$. With this interpretation, player
$D^{\mu}_{m,\tau}$ represents all histories where at each time $k$, $1 \le k \le m$, an agent can
potentially achieve an expected reward of at least $\tau$ times the optimal expected
reward. If $A^p_m \cap D^{\mu}_{m,\tau} = \emptyset$, then environment $\mu$ is "too difficult" for the agent
$p$; there is no interaction where at every cycle $k$ the agent has the potential to
receive an expected reward of at least $\tau V^{*\mu}_{1m}$.



3 Background in Algorithmic Information Theory 

We denote finite binary strings by $x \in \{0,1\}^*$ and the length of strings by $l(x)$.
Let the pairing function $\langle\cdot,\cdot\rangle$ be the standard one-to-one mapping from
$\mathcal{N} \times \mathcal{N}$ to $\mathcal{N}$, where $\langle x,y\rangle = x'y = 1^{l(l(x))}0\,l(x)\,x\,y$ and
$l(\langle x,y\rangle) = l(y) + l(x) + 2l(l(x)) + 1$.
The Kolmogorov complexity $C(x)$ is the length of the shortest binary program
to compute $x$ on a universal Turing machine $\psi$, $C(x) = \min\{l(d) : \psi(d) = x\}$.
The prefix-free Kolmogorov complexity, $K(x)$, restricts the universal machine $\psi$
so no halting program is a proper prefix of another halting program. For the
rest of this paper, we use plain Kolmogorov complexity. Kolmogorov complexity
is not finitely computable. The conditional Kolmogorov complexity of $x$ relative
to $y$, $C(x|y)$, is defined as the length of a shortest program to compute $x$, using $y$
as an auxiliary input to the computation. The complexity of two strings $x$ and $y$
is denoted by $C(x,y) = C(\langle x,y\rangle)$. The conditional complexity of two strings
is $C(x|y,z) = C(x|\langle y,z\rangle)$. The complexity of information in $x$ about $y$ is
$I(x:y) = C(y) - C(y|x)$. The conditional mutual information is
$I(x:y|z) = C(y|z) - C(y|x,z)$ and can be interpreted as the information $z$ receives about $y$ when given
$x$. The complexity of a function $f : \{0,1\}^* \to \{0,1\}^*$ is
$C(f) = \min\{C(p) : \forall x\; \psi(p,x) = f(x)\}$. The Levin complexity is defined by
$Ct(x) = \min_p\{l(p) + \log t(p,x) : \psi(p) = x\}$, with $t(p,x)$ being the number of steps taken by $\psi$ until
$x$ is printed (without $\psi$ necessarily halting). Levin complexity is computable. The
complexity of a finite set $S$ is $C(S)$, the length of the shortest program $f$ from
which the universal Turing machine $\psi$ computes a listing of the elements of $S$
and then halts. If $S = \{x_1, \ldots, x_n\}$, then $\psi(f) = \langle x_1, \langle x_2, \ldots, \langle x_{n-1}, x_n\rangle \ldots\rangle\rangle$.
The conditional complexity $C(x|S)$ is the length of the shortest program from
which $\psi$, given $S$ literally as auxiliary information, computes $x$. For every set $S$
containing $x$, it must be that $C(x|S) \le \log|S| + O(1)$. The randomness deficiency
is the lack of typicality of $x$ with respect to set $S$, with $\delta(x|S) = \log|S| - C(x|S)$
for $x \in S$ and $\infty$ otherwise. If $\delta(x|S)$ is small enough, then $x$ is a typical element
of $S$; $x$ satisfies all simple properties that hold with high majorities of strings
in $S$.
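The pairing function is the one fully concrete object in this section, so it can be written out. The sketch below assumes that $l(x)$ is written in ordinary binary notation (the standard correspondence between natural numbers and strings used in [LV08] differs slightly); the function names `pair` and `unpair` are illustrative.

```python
def pair(x: str, y: str) -> str:
    """Encode <x, y> = 1^{l(l(x))} 0 l(x) x y for binary strings x and y."""
    length = format(len(x), "b")   # l(x) written in binary (an assumption, see above)
    return "1" * len(length) + "0" + length + x + y

def unpair(z: str):
    """Recover (x, y) from the self-delimiting encoding produced by pair."""
    k = z.index("0")                       # number of leading 1s = l(l(x))
    n = int(z[k + 1 : 2 * k + 1], 2)       # decode l(x)
    start = 2 * k + 1
    return z[start : start + n], z[start + n :]

z = pair("101", "0011")
print(z, unpair(z))
```

The prefix $1^{l(l(x))}0\,l(x)$ tells a decoder where $x$ ends, which is why the encoding costs $2l(l(x)) + 1$ bits beyond $l(x) + l(y)$.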

Example 3 (Anatoly's Games). Chess player Anatoly with function $f_A$ can be
represented as a set $A$ (see Example 1). Set $A$ is simple relative to $f_A$ and the
maximum number of plies $m$, with $C(A|f_A, m) = O(1)$, where $O(1)$ is the length
of code required to use $f_A$ and $m$ to enumerate all games $x \in A$.

The following theorem, used in Section 5, shows that if a string $x$ is contained by
a large number of sets of a certain complexity, then it is contained by a simpler
set [VV04]. The enumerative complexity, $CE(\mathcal{F})$, is the Kolmogorov complexity
of a non-halting program that enumerates all the sets $F \in \mathcal{F}$. This theorem also
holds for conditional complexity bounds, $C(F|y)$.

Theorem 1 ([VV04]). Let $\mathcal{F}$ be a family of subsets of a set of strings $Q$. If
$x \in Q$ is a member of each of $2^k$ sets $F \in \mathcal{F}$ with $C(F) \le r$, then $x$ is a member
of a set $F'$ in $\mathcal{F}$ with $C(F') \le r - k + O(\log k + \log r + \log\log|Q| + CE(\mathcal{F}))$.



4 Player Strategy Learning 

Players $A$ and $B$ can learn information about each other's strategies from a single
interaction (game) $x \in A \cap B$ or from their entire interaction set (all possible
games) $A \cap B$. The capacity of a player $A$ is the maximum amount of information
that $A$ can receive about another player through all possible interactions, i.e.
their interaction set. It is equal to the log of the number of possible subsets that
it can have, $\log 2^{|A|} = |A|$. We define the lack of typicality of a subset $S$ with
respect to $A$ to be $\delta(S|A) = |A| - C(S|A)$, for $S \subseteq A$ and $\infty$ otherwise.

Example 4 (Capacity). Boris $B$ uses a range of black openings whereas Bill $B'$
uses only the Sicilian defence. So Boris has a higher capacity, $|B| \gg |B'|$, and
can potentially learn more than Bill.

Example 5 (Randomness Deficiency). Let $A$ be the chess games played by
Anatoly. Bob is a simple player $B'$, who only moves his knight back and forth. Set
$S = A \cap B'$ represents all $A$'s games with $B'$. The randomness deficiency of these
games, $\delta(S|A)$, is high, as $S$ is easily computable from $A$, with $C(S|A) \ll |A|$.
Let $T \subseteq A$, in which $T = A \cap B$ are games played between Anatoly and Boris,
who uses a range of chess strategies unknown to Anatoly. Then $\delta(T|A)$ is low
and $C(T|A)$ is high.

If $A$ views every interaction in $A \cap B$, the amount of information $B$ reveals about
itself is $I(A \cap B : B|A)$, the mutual information between $B$ and $A \cap B$, given $A$.
This term can be reduced to $C(A \cap B|A) - C(A \cap B|A,B) = C(A \cap B|A) + O(1)$.
We define the amount of knowledge that $A$ received about $B$ from the interaction
set as:

    $R(B|A) = C(A \cap B|A).$    (1)

The higher the randomness deficiency, $\delta(A \cap B|A)$, of an interaction set, $A \cap B$,
with respect to player $A$, the less information, $R(B|A)$, player $A$ can receive
about its opponent $B$, with

    $R(B|A) + \delta(A \cap B|A) = |A|.$    (2)

Player $A$ receives the most information about its opponent when the randomness
deficiency is $\delta(A \cap B|A) \approx 0$.
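Equation (2) is a direct consequence of the two definitions given in this section. The short derivation below only substitutes $S = A \cap B$ into the lack of typicality of a subset and then applies definition (1); it adds no assumption beyond $A \cap B \subseteq A$.

```latex
\begin{align*}
\delta(A \cap B \mid A) &= |A| - C(A \cap B \mid A)
    && \text{(lack of typicality with } S = A \cap B \subseteq A\text{)} \\
  &= |A| - R(B \mid A)
    && \text{(definition (1))} \\
\Longrightarrow \quad R(B \mid A) + \delta(A \cap B \mid A) &= |A|.
\end{align*}
```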

Example 6. Let Anatoly, $A$, and Bob, $B'$, be the players of Example 5. Bob has
a simple strategy and has a lower capacity, $|B'| \ll |A|$, but he learns a lot from
Anatoly, with $\delta(A \cap B'|B') \approx 0$ and $R(A|B') \approx |B'|$. Anatoly learns very little
from Bob, with $R(B'|A) \approx 0$ and $\delta(A \cap B'|A) \approx |A|$.

Players can reveal information about themselves through a single interaction.
The amount of information that $A$ received about $B$ from their interaction $x$ is

    $I(x:B|A) = C(x|A) - C(x|A,B).$    (3)

A graphical depiction of the complexities relating to $A$, $B$, and $x$ can be seen
in Fig. 2.

[Figure: diagram of players A (Anatoly) and B (Boris) and their interaction x,
with regions labelled C(A|x,B), C(x|A,B), C(A|B), C(B|A), C(x|B), and C(x|A).]
Fig. 2. The complexities and information of $A$, $B$, and their interaction $x$. The
relationships hold up to logarithmic precision.

We define the lack of typicality of an interaction $x$ with respect to
both players to be

    $\delta(x|A,B) = \log|A \cap B| - C(x|A,B)$    (4)

for $x \in A \cap B$ and $\infty$ otherwise. If $\delta(x|A,B)$ is small, then $x$ represents a typical
interaction. The information passed from player $B$ to player $A$ through a single
interaction is represented by

    $I(x:B|A) + \delta(x|A) = \log(|A| / |A \cap B|) + \delta(x|A,B).$    (5)

For players with the same capacity, the information passed between them
through a single interaction satisfies

    $I(x:B|A) + \delta(x|A) = I(x:A|B) + \delta(x|B) + O(1).$    (6)
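Equations (5) and (6) follow from the definitions by cancelling the $C(x|A)$ terms. The derivation below is a sketch for $x \in A \cap B$, using definition (3), the randomness deficiency $\delta(x|A) = \log|A| - C(x|A)$ from Section 3, and definition (4).

```latex
\begin{align*}
I(x : B \mid A) + \delta(x \mid A)
  &= \bigl[C(x \mid A) - C(x \mid A, B)\bigr] + \bigl[\log|A| - C(x \mid A)\bigr] \\
  &= \log|A| - C(x \mid A, B) \\
  &= \log|A| - \log|A \cap B| + \bigl[\log|A \cap B| - C(x \mid A, B)\bigr] \\
  &= \log\frac{|A|}{|A \cap B|} + \delta(x \mid A, B).
\end{align*}
```

Exchanging the roles of $A$ and $B$ gives $I(x:A|B) + \delta(x|B) = \log(|B| / |A \cap B|) + \delta(x|A,B)$, up to the additive constant incurred by swapping the order of the conditions, so the two sides of (6) agree whenever $|A| = |B|$.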

Example 7. Anatoly $A$ plays a game $x$ with Boris $B$, who has the same capacity,
with $|A| = |B|$. Anatoly tricks Boris with a King's gambit and the game $x$
follows a series of moves extremely familiar to Anatoly. Boris reacts with the
most obvious move at every turn. In this case the game is simple to Anatoly,
with $\delta(x|A)$ being large and $I(x:B|A)$ being small. The game is new to Boris,
with $\delta(x|B)$ being small and $I(x:A|B)$ being large. Thus Boris learns more than
Anatoly from $x$.

If the players have a deterministic interaction, then $A \cap B = \{y\}$ and the
information $A$ received from $B$ reduces to $I(y:B|A) + \delta(y|A) = \log|A|$.

5 Player Approximation and Interaction Complexity 

We show that, given an interaction $x$ between players $A$ and $B$, $A$ can
"construct" an approximate player $B'$ that has interaction $x$ using a small number of
extra bits $\epsilon$, where $C(B'|A,x) = \epsilon$. We also show that the conditional complexity
$C(B'|A)$ of the approximate player $B'$ is not greater than the amount of
information $I(x:B|A)$ that $A$ obtains about $B$ (up to logarithmic precision). We use
the simplified notation $\log A = \log|A|$. We also use the player space notation,
$\mathcal{B}$, to denote a set of sets of strings.

Theorem 2. Given are a player space $\mathcal{B}$ and players $A$ and $B \in \mathcal{B}$ over strings
of size $n$ with $x \in A \cap B$ and $C(\mathcal{B}) = O(\log n)$. Then there is a player $B' \in \mathcal{B}$
with $x \in B'$, $C(B'|A,x) = O(s)$, and $C(B'|A) \le I(x:B|A) + O(s)$, with
$s = \log C(B|A) + \log n$.



Proof. Let $r = C(B|A)$. We define $Q$ as the set of strings of size $n$, with
$\log\log|Q| = \log n$. We set $\mathcal{F} = \mathcal{B}$, and so $CE(\mathcal{F}) = O(\log n)$. Let $N$ be the
number of sets $S \in \mathcal{B}$ with $C(S|A) \le r$ and $x \in S$. We first show that
$C(B|A,x) \le \log N + O(\log nr)$. There is a program that, when given $x$, $A$,
$\mathcal{B}$, and $r$, with $C(\mathcal{B}, r) = O(\log nr)$, can enumerate all sets in $\mathcal{B}$ containing $x$
with conditional complexity to $A$ being less than or equal to $r$. Thus $B$ can be
created using such a program and an index of size $\lceil \log N \rceil$. By the application of
Theorem 1, conditional on $A$, with $k = \lfloor \log N \rfloor$, there is a set $B' \in \mathcal{F}$ with $x \in B'$
and $C(B'|A) \le r - k + O(\log nr) \le C(B|A) - C(B|A,x) + O(\log nr) =
I(x:B|A) + O(\log nr)$. To prove $C(B'|A,x) = O(s)$, assume $B'$ is the set satisfying the
above properties that minimizes $C(B'|A)$ up to precision $O(s)$. It must be that
$C(B'|A,x) = O(s)$. Otherwise $C(B'|A,x) = \omega(s)$ and there is a set $B''$ satisfying
the properties above with $C(B''|A) \le C(B'|A) - C(B'|A,x) + O(s) = C(B'|A) - \omega(s)$,
causing a contradiction.
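The "enumerate, then index" step of the proof can be illustrated on a finite toy player space. The sketch below is only an analogy: Kolmogorov complexity is not computable, so the filter $C(S|A) \le r$ is omitted and every set in a hypothetical list `space` is treated as a candidate. The names `candidates`, `encode`, and `decode` are illustrative.

```python
from math import ceil, log2

def candidates(space, x):
    """Sets of the player space that contain x, in the fixed enumeration order of `space`.
    (The proof additionally filters by C(S|A) <= r, which cannot be computed here.)"""
    return [S for S in space if x in S]

def encode(space, x, B):
    """Describe B, given x, by its position among the candidates."""
    return candidates(space, x).index(B)

def decode(space, x, index):
    """Recover the described set from x and the index."""
    return candidates(space, x)[index]

# Hypothetical player space over strings of size 2.
space = [frozenset(s) for s in
         ({"00", "01"}, {"01", "10"}, {"10", "11"}, {"01", "11"}, {"00"})]
x, B = "01", frozenset({"01", "11"})

index = encode(space, x, B)
bits = ceil(log2(len(candidates(space, x))))   # index size, ceil(log N)
print(index, bits, decode(space, x, index) == B)
```

With $N$ candidates, an index of $\lceil \log N \rceil$ bits is enough to single out $B$, which is the source of the $\log N$ term in the bound on $C(B|A,x)$.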

Example 8 (Opponent Reconstruction). Anatoly, $A$, plays a chess game $x$ with
Boris, $B$, with $x \in A \cap B$. The players use a random string $b$ of size $C(x|A,B)$ to
help decide their moves. Without using $b$, Anatoly can "construct" Bob, $B'$, an
impersonator of Boris, using information from the game $x$ and
$O(\log C(B|A) + \log l(x))$ bits. Bob can play the same game $x$ with Anatoly.

Given are players $A$ and $B$ who interact, in that $A \cap B \ne \emptyset$. We show
that there exists an interacting player $B'$ that has complexity bounded by the
mutual information of $A$ and $B$. If Theorem 1 can be strengthened such that the
enumerative complexity term $CE(\mathcal{F})$ is replaced by $CEE(\mathcal{F})$, the complexity of
enumerating both the sets and the elements of the sets of $\mathcal{F}$, then the precision
of Theorems 3 and 4 can be strengthened with the replacement of the Levin
complexity term $Ct(A)$ with Kolmogorov complexity $C(A)$.

Theorem 3. Given are a player space $\mathcal{B}$ and players $A$ and $B \in \mathcal{B}$ with
$A \cap B \ne \emptyset$. Then there exists a player $B' \in \mathcal{B}$ with $A \cap B' \ne \emptyset$ and
$C(B') \le I(A:B) + O(s)$, with $s = \log C(B) + \log Ct(A) + C(\mathcal{B})$.

Proof. Let $r = C(B)$, $h = Ct(A)$, and $q = 2^{C(\mathcal{B})}$. We define
$Q = \{\langle S\rangle : Ct(S) \le h\}$, with $\langle S\rangle$ being an encoding of set $S$. This implies
$\log\log|Q| = O(\log h)$. We define $\mathcal{F}$ with a recursive function $\lambda : \mathcal{B} \to \mathcal{F}$, with
$\lambda(S) = \{\langle T\rangle : Ct(T) \le h,\ S \cap T \ne \emptyset\}$. It must be $C(\lambda) = O(\log h)$. The
enumeration complexity of $\mathcal{F}$ requires the encoding of $\mathcal{B}$ and $\lambda$, and so
$CE(\mathcal{F}) = O(\log hq)$. Thus if $\langle T\rangle \in \lambda(S)$, then $T \cap S \ne \emptyset$. Let $N$ be the number of
sets $S \in \mathcal{B}$ with $C(S) \le r$ and $S \cap A \ne \emptyset$. Thus $C(B|A) \le \log N + O(\log hqr)$, as
there is a program that, when given $A$, $r$, $\mathcal{B}$, and an index of size $\lceil \log N \rceil$,
can return any such $S$. By the application of Theorem 1, with $x = \langle A\rangle$ and
$k = \lfloor \log N \rfloor$, there is a set $F \in \mathcal{F}$ with $x \in F$ and
$C(F) \le r - k + O(\log hqr) \le C(B) - C(B|A) + O(\log hqr) = I(A:B) + O(\log hqr)$.
A set $B' \in \mathcal{B}$, with $\lambda(B') = F$, can be easily recovered from $F$ by enumerating
all sets in $\mathcal{B}$, applying $\lambda$ to each one, and selecting the first one which produces
$F$. So $C(B') \le C(F) + O(\log q) \le I(A:B) + O(\log hqr)$. Since $\langle A\rangle \in \lambda(B')$, it
must be that $A \cap B' \ne \emptyset$.
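The map $\lambda$ and the recovery of $B'$ from $F$ can be sketched on finite sets. The code below is an analogy under stated assumptions: the Levin complexity filter $Ct(T) \le h$ is replaced by membership in a hypothetical finite list `universe` of candidate sets, and the player space `space` is a small hand-picked list. Changing the intersection test to `0 < len(S & T) <= c` would give the variant of $\lambda$ used for bounded uncertainty.

```python
def lam(S, universe):
    """lambda(S): the candidate sets T in `universe` that intersect S."""
    return frozenset(T for T in universe if S & T)

def recover(F, space, universe):
    """First player B' in the enumeration of `space` with lambda(B') = F."""
    for S in space:
        if lam(S, universe) == F:
            return S
    return None

# Hypothetical candidate sets (standing in for {T : Ct(T) <= h}) and player space.
universe = [frozenset(t) for t in ({"a"}, {"b"}, {"a", "c"}, {"d"})]
space = [frozenset(s) for s in ({"a", "b"}, {"c"}, {"a", "d"}, {"a", "b", "e"})]

A = frozenset({"a"})
B = space[3]                           # the original opponent, which intersects A
F = lam(B, universe)                   # A is one of the sets collected in F
B_prime = recover(F, space, universe)
print(sorted(B_prime), A in F, bool(A & B_prime))
```

In this toy run the recovered player differs from the original `B` but still intersects `A`, which mirrors the role of $B'$ in the theorem: any player that $\lambda$ maps to $F$ must interact with every set collected in $F$.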



We show that if a player A interacts with numerous players of a given com- 
plexity and uncertainty, then there exists a simple player B' who interacts with 
A with the same uncertainty. 

Theorem 4. Given are player space $\mathcal{B}$, player $A$, and $2^k$ players $B \in \mathcal{B}$ where
for each $B$, $0 < |A \cap B| \le c$ and $C(B) \le r$. There is a player $B' \in \mathcal{B}$ such that
$0 < |A \cap B'| \le c$ and $C(B') \le r - k + O(s)$, with
$s = \log Ct(A) + \log c + \log k + \log r + C(\mathcal{B})$.

Proof. Let $h = Ct(A)$ and $q = 2^{C(\mathcal{B})}$. We can define $Q \subseteq \{0,1\}^*$ as a set of
strings, each encoding a set (player) $S$ whose Levin complexity is less than or
equal to $h$. This implies $\log\log|Q| = O(\log h)$. We represent the encoding of
$S$ with $\langle S\rangle$. We define $\mathcal{F}$ with a recursive function $\lambda : \mathcal{B} \to \mathcal{F}$, with
$\lambda(S) = \{\langle T\rangle : Ct(T) \le h,\ 0 < |S \cap T| \le c\}$. Thus it must be $C(\lambda) = O(\log ch)$. The
enumeration complexity of $\mathcal{F}$ requires the encoding of $c$, $h$, and $\mathcal{B}$, with
$CE(\mathcal{F}) = O(\log chq)$. Thus if $\langle T\rangle \in \lambda(S)$, then player $T$ and player $S$ have a non-empty
intersection of size at most $c$. From the assumptions of this theorem, $\langle A\rangle$ is
covered by at least $2^k$ sets $\lambda(B) \in \mathcal{F}$ of complexity $C(\lambda(B)) \le r + O(\log chq)$.
By the application of Theorem 1, with $x = \langle A\rangle$, there is a set $F \in \mathcal{F}$ with
$x \in F$ and $C(F) \le r - k + O(\log(chkqr))$. A set $B' \in \mathcal{B}$ with $\lambda(B') = F$ can
be recovered from $F$ by enumerating all sets in $\mathcal{B}$, applying $\lambda$ to each one, and
selecting the first one which produces $F$. Therefore $C(B'|F) \le O(\log chq)$ and
so $C(B') \le C(F) + O(\log chq) \le r - k + O(\log(chkqr))$. Since $\langle A\rangle \in \lambda(B')$, it
must be that $0 < |A \cap B'| \le c$, thus the theorem is proven.

Example 9. An example application of Theorem 4 is a game of the same form as
Example 2. Magnus, represented by set $B$, plays $2^k$ games of chess against $2^k$ young
players $A \in \mathcal{A}$. Furthermore, the players and Magnus are deterministic, with
$|A \cap B| = 1$ for each $A \in \mathcal{A}$. The difficulty threshold $\tau$ is set to 1, so every one
of the young players beats Magnus. By Theorem 4, if all players $A \in \mathcal{A}$ have
complexity at most $C(A) \le r$, then there is a simpler player $A' \in \mathcal{A}$ that will
win against Magnus, with $C(A') \le r - k + \epsilon$ (with $\epsilon$ being of logarithmic order)
and $|A' \cap B| = 1$.

6 Future Work: Universal Interaction 

Since the agents and environments of the cybernetic agent model of Section 2.2
can be translated into set representations, there is potential application of the
proof techniques used in Section 5 to Universal Artificial Intelligence [Hut04], and
in particular to describe properties of the AIXI model. The universal
environment, $\xi$, is defined using a form of the universal prior,
$m(x) = \sum_{p : \psi(p) = x} 2^{-l(p)}$,
representing a semimeasure (degenerate probability) over all infinite strings, with
$\xi(y\underline{x}_{1:k}) = \sum_{\rho} 2^{-K(\rho)} \rho(y\underline{x}_{1:k})$. The universal environment $\xi$ is the weighted
summation over all chronological environments $\rho$. The term $K(\rho)$ represents the
prefix-free Kolmogorov complexity of $\rho$. The AIXI model $p^{\xi}$ is the optimal
agent for the environment $\xi$ with horizon $m$, in that $p^{\xi}_m = \arg\max_p V^{p\xi}_{1m}$. The
sequence of self-optimizing AIXI agents for each time horizon is $\{p^{\xi}_i\}_{i=1,2,\ldots}$. Let
$\mathcal{M}$ be a set of environments where a sequence of self-optimizing policies $\tilde{p}_m$
exists. The sequence converges to receive the optimal average for all environments,
with $\forall \nu \in \mathcal{M}$: $\frac{1}{m} V^{\tilde{p}_m \nu}_{1m} \to \frac{1}{m} V^{*\nu}_{1m}$ as $m \to \infty$. By Theorem 5.29 from [Hut04], it must be
that the sequence of AIXI agents is optimal for $\mathcal{M}$, with
$\frac{1}{m} V^{p^{\xi}_m \nu}_{1m} \to \frac{1}{m} V^{*\nu}_{1m}$ as $m \to \infty$.
We use the conversion of agents $p$ and environments $\mu$ to sets $A^p_m$ and $D^{\mu}_{m,\tau}$ as
introduced at the end of Section 2.2. The sequence of self-optimizing AIXI agents,
$\{p^{\xi}_i\}_{i=1,2,\ldots}$, is universal with regard to interaction with respect to $\mathcal{M}$. It is easy
to see that for all $\tau$ and all environments $\nu \in \mathcal{M}$, there is a number $m_{\nu,\tau}$ where
for all $m > m_{\nu,\tau}$, $A^{p^{\xi}_m}_m \cap D^{\nu}_{m,\tau} \ne \emptyset$. This implies a set representation of agent
dynamics can be used to describe further properties of the AIXI model. There
is potential for a deep connection, roughly analogous to how prefix-free
Kolmogorov complexity and the universal prior are related by the Coding Theorem,
$K(x) = -\log m(x) + O(1)$ [LV08].

7 Conclusions 

We used Algorithmic Information Theory to quantify the information exchanged
between agents that interact in non-cooperative games (Sec. 4). We have shown
that an agent $A$ can construct an approximation of his opponent $B$ using
information from a single interaction (game) with $B$ (Th. 2). We have shown that
if an agent $B$ with superfluous information interacts with an environment $A$
and achieves a certain reward, then there exists another agent $B'$ without this
information that can achieve the same reward (Th. 3). We have also shown that
if multiple agents interact with an environment to achieve a certain reward,
then there exists a simple agent who can achieve the same reward (Th. 4). Our
constructions are compatible with Universal Artificial Intelligence, in that the
AIXI model can be interpreted as universal with regard to interactions with
environments (Section 6).